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State Space Reconstruction for Multivariate Time Series Prediction 
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In the nonlinear prediction of scalar time series, the common practice is to reconstruct the state 
space using time-delay embedding and apply a local model on neighborhoods of the reconstructed 
space. The method of false nearest neighbors is often used to estimate the embedding dimension. 
For prediction purposes, the optimal embedding dimension can also be estimated by some prediction 
error minimization criterion. We investigate the proper state space reconstruction for multivariate 
time series and modify the two abovementioned criteria to search for optimal embedding in the set 
of the variables and their delays. We pinpoint the problems that can arise in each case and compare 
the state space reconstructions (suggested by each of the two methods) on the predictive ability of 
the local model that uses each of them. Results obtained from Monte Carlo simulations on known 
chaotic maps revealed the non-uniqueness of optimum reconstruction in the multivariate case and 
showed that prediction criteria perform better when the task is prediction. 

PACS numbers: 05.45.Tp, 02.50.Sk, 05.45.a 

Keywords: nonlinear analysis, multivariate analysis, time series, local prediction, state space reconstruction 



I. INTRODUCTION 

Since its publication Takens' Embedding Theorem [Tj 
(and its extension, the Fractal Delay Embedding Preva- 
lence Theorem by Sauer et al. 0) has been used in time 
series analysis in many different settings ranging from 
system characterization and approximation of invariant 
quantities, such as correlation dimension and Lyapunov 
exponents, to prediction and noise-filtering The Em- 
bedding Theorem implies that although the true dynam- 
ics of a system may not be known, equivalent dynamics 
can be obtained under suitable conditions using time de- 
lays of a single time series, treated as an one-dimensional 
projection of the system trajectory. 

Most applications of the Embedding Theorem deal 
with univariate time series, but often measurements of 
more than one quantities related to the same dynamical 
system are available. One of the first uses of multivari- 
ate embedding was in the context of spatially extended 
systems where embedding vectors were constructed from 
data representing the same quantity measured simulta- 
neously at different locations [1, Multivariate em- 
bedding was used for noise reduction [3| and for surro- 
gate data generation with equal individual delay times 
and equal embedding dimensions for each time series Q • 
In nonlinear multivariate prediction, the prediction with 
local models on a space reconstructed from a different 
time series of the same system was studied in [8:]. This 
study was extended in 19\ by having the reconstruction 
utilize all of the observed time series. Multivariate em- 
bedding with the use of independent components analysis 
was considered in and more recently multivariate em- 
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bedding with varying delay times was studied in [111 . Il2l | . 

In this work, we focus on the state space reconstruc- 
tion from multivariate time series from discrete dynami- 
cal systems, so that the investigation of optimal embed- 
ding does not involve the delay parameter but only the 
embedding dimension for each variable. For this, we ad- 
just two well-known approaches used for univariate time 
series, i.e. the false nearest neighbor [I^ and the opti- 
mal reconstruction for prediction evaluated in a test set 
[3, [iBl ■ We study the consistency of the techniques in 
estimating the embedding dimensions as well as their per- 
formance by means of out-of-sample prediction using the 
selected embedding. Monte Carlo simulations at differ- 
ent settings of system complexity, system dimension and 
time series lengths are used to evaluate the embedding 
techniques. 

In Section[TT]the embedding for univariate time series is 
briefly discussed. In Section HIIl the discussion is extended 
to multivariate time series and the suggested techniques 
for estimating the embedding are presented. Then in 
Section IIVI the results of Monte Carlo simulations are 
presented and in Section |V] the results are discussed and 
conclusions are given. 



II. UNIVARIATE EMBEDDING 

A dynamical system generates a trajectory in a D- 
dimensional manifold F. For discrete time the dynamical 
system is defined by the ZJ-dimensional map F : F F 
as 

y„+i = F(y„), neN, 

where y„ € F is the state vector at time step n. 

The observed scalar time series {xn}n=i of length N 
is the projection of the segment of the system trajectory 
{ynlriLi given by a measurement function /i : F i-^ M 
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as Xn = h{yn). Despite the apparent loss of information 
of the system dynamics by the projection, the system 
dynamics may be recovered through suitable state space 
reconstruction from the scalar time series. 



A. Reconstruction of the state space 

According to Taken's embedding theorem a trajectory 
formed by the points x„ of time-delayed components 

N 



from the time series {xn\n=i 



(■^n — (m — l)r i •^n — {m—2)T: ■• '^n ) 7 



(1) 



under certain genericity assumptions, is an one-to-one 
mapping of the original trajectory of y„ provided that m 
is large enough. 

Given that the dynamical system "lives" on an attrac- 
tor A C r, the reconstructed attractor A through the 
use of the time-delay vectors is topologically equivalent 
to A. A sufficient condition for an appropriate unfolding 
of the attractor is m > 2(i+ 1 where d is the box-counting 
dimension of A. 

The embedding process is visualized in the following 
graph 

y„ e A c r ^ y„+i G A c r 

-i/i \-h 

G K Xn+1 G K 

ie ie 

x„ G i C R™ S x„+i G i C M™ 



where e is the embedding procedure creating the delay 
vectors from the time series and G is the reconstructed 
dynamical system on A. G preserves properties of the 
unknown F on the unknown attractor A that do not 
change under smooth coordinate transformations. 



B. Univariate local prediction 

For a given state space reconstruction, the local predic- 
tion at a target point x„ is made with a model estimated 
on the K nearest neighboring points to x„. The local 
model can have a simple form, such as the zeroth order 
model (the average of the images of the nearest neigh- 
bors), but here we consider the linear model 



where the superscript (n) denotes the dependence of the 
model parameters (a'^"^ and on the neighborhood 

of x„. The neighborhood at each target point is defined 
either by a fixed number K of nearest neighbors or by 
a distance determining the borders of the neighborhood 
giving a varying K with x„. 



C. Selection of embedding parameters 

The two parameters of the delay embedding in ([1]) are 
the embedding dimension m, i.e. the number of compo- 
nents in x„ and the delay time r. We skip the discussion 
on the selection of r as it is typically set to 1 in the case 
of discrete systems that we focus on. Among the ap- 
proaches for the selection of m we choose the most popu- 
lar method of false nearest neighbors (FNN) and present 
it briefly below p^ . 

The measurement function h projects distant points 
{Vn} of the original attractor to close values of {x„}. 
A small m may still give badly projected points and we 
seek the reconstructed state space of the smallest embed- 
ding dimension m that unfolds the attractor. This idea 
is implemented as follows. For each point x™ in the m- 
dimensional reconstructed state space, the distance from 
its nearest neighbor x™^^^ is calculated, (i(x™,x™j^j) = 
||x™ — x™j-^^||. The dimension of the reconstructed state 
space is augmented by 1 and the new distance of these 
vectors is calculated, d(x;7+i,x^(+i) = ||x;^'+i - x^^'^+^H. 
If the ratio of the two distances exceeds a predefined tol- 
erance threshold r the two neighbors are classified as false 
neighbors, i.e. 



nim) - 



d{^n,K(l)) '''' 



(2) 



The criterion that the embedding dimension m is high 
enough to unfold the attractor is that the percentage of 
points for which rn{m) > r, is essentially zero, typically 
requiring to be smaller than 1%. 

The selection of r should be large enough to allow for 
exponential divergence. In [l^, a stricter criterion is in- 
troduced, that the original distance of the point to its 
nearest neighbor in the m-dimensional space does not 
exceed the standard deviation of x„ divided by r. If it 
does the point is omitted from the percentage calcula- 
tion, since the points are already too far apart to be real 
neighbors. A good and often used value for r is 10. 

Another popular method for the selection of the em- 
bedding dimension m is from the optimization of the fit 
of a local linear model using a criterion for the goodness- 
of-fit or the goodness-of-prediction [l^. The idea 
here is that for a local linear model fit to be optimum 
the attractor must be fully unfolded. After the selection 
of an appropriate r, for state space reconstructions with 
m varying from 1 to a maximum mmax, the fit or predic- 
tion error of local prediction models is calculated. For the 
errors a statistic such as the normalized root mean square 
error (NRMSE) is used and the embedding dimension is 
chosen as the one that minimizes this statistic. Whereas 
the false nearest neighbors method determines a minimal 
sufficient embedding dimension, this method picks a di- 
mension for which the attractor is unfolded (so as to give 
better predictions) that may be larger than the minimal. 
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III. MULTIVARIATE EMBEDDING 



B. Multivariate local prediction 



In Section [IT] we gave a summary of the reconstruction 
technique for a deterministic dynamical system from a 
scalar time series generated by the system. However, it is 
possible that more than one time series are observed that 
are possibly related to the system under investigation. 
For p time series measured simultaneously from the same 
dynamical system, a measurement function H : F i— > R^* 
is decomposed to hi, i = 1, ... ,p, defined as in SectionlTTl 
giving each a time series {xi^n}n=i- According to the dis- 
cussion on univariate embedding any of the p time series 
can be used for reconstruction of the system dynamics, 
or better, the most suitable time series could be selected 
after proper investigation. In a different approach all the 
available time series are considered and the analysis of 
the univariate time series is adjusted to the multivariate 
time series. 



A. From univariate to multivariate embedding 



The prediction for each time series Xi^n, i ~ I7 • • • iPj is 
performed separately by p local models, estimated as in 
the case of univariate time series, but for reconstructed 
points formed potentially from all p time series as given 
in © (e.g. see 0). 

We propose an extension of the NRMSE for the pre- 
diction of one time series to account for the error vec- 
tors comprised of the individual prediction errors for 
each of the predicted time series. If we have one step 
ahead predictions for the p available time series, i.e. ii.„, 
i = 1, . . . ,p (for a range of current times n — 1), we define 
the multivariate NRMSE 



NRMSE 



■ 1 •^p,n 



\ n 



(4) 



where Xi is the mean of the actual values of Xi^n over all 
target times n. 



Given that there are p time series {xi^n}n=i^ * 

the equivalent to the reconstructed state vec- 
tor in ll]) for the case of multivariate embedding is of the 
form 



(-^l,?! — (mi — l)ri ; -^l-n — (mi — 2)ri 7 ■■ •^1,715 

*^2,n — (m2 — l)r2 i *^2,ni ■■■i ^p.n) 



(3) 



and are defined by an embedding dimension vector m = 
(mi,...,mp) that indicates the number of components 
used from each time series and a time delay vector 
T = (ti, Tp) that gives the delays for each time series. 
The corresponding graph for the multivariate embedding 
process is shown below. 



Yn&A C F 

hi i/i2 ■■• \j/ip 
^l,n •^2,n ■ ■ ■ ^p,n 
\je ie ■■■ 1/ e 



eA c 



pM 



^ y„+i eAcr 

v/ hi i/l2 ■■• \j?lp 
\ie ie ■■■ e 

S x„+i e i c 



The total embedding dimension M is the sum of the 
individual embedding dimensions for each time series 
M = X]r=i"^i- Note that if redundant or irrelevant 
information is present in the p time series, only a sub- 
set of them may be represented in the optimal recon- 
structed points x„ . The selection of m and t follows the 
same principles as for the univariate case: the attrac- 
tor should be fully unfolded and the components of the 
embedding vectors should be uncorrelated. A simple se- 
lection rule suggests that all individual delay times and 
embedding dimensions are the same, i.e. m — ml and 
r = t1 with 1 a p- vector of ones [g, 0|. Here, we set 
again = 1 , i 1 , . . . , p, but we consider both fixed and 
varying in the implementation of the FNN method 
(see Section IhTD]) . 



C. Problems and restrictions of multivariate 
reconstructions 

A major problem in the multivariate case is the prob- 
lem of identification. There are often not unique m and 
r embedding parameters that unfold fully the attractor. 
A trivial example is the Hcnon map [171] 



1 = l.A-xl 

Hn+l = O.iXn 



Vn 



(5) 



It is known that for the state space reconstruction from 
the observable Xn the appropriate embedding parame- 
ters are m — 2 and t — 1. Due to the fact that i/n 
is a lagged multiple of Xn the attractor can obviously 
be reconstructed from the bivariate time series {a;„,y„} 
equally well with any of the following two-dimensional 
embedding schemes 



{Xji , Xn- 



{Xn, Vn) 



{Vn,yn-l) 



since they are essentially the same. This example shows 
also the problem of redundant information, e.g. the state 
space reconstruction would not improve by augmenting 
the delay vector x„ = (a;„, Xn-i) with the component ?/„ 
that actually duplicates Xn-i- Redundancy is inevitable 
in multivariate time series as synchronous observations of 
the different time series are generally correlated and the 
fact that these observations are used as components in 
the same embedding vector adds redundant information 
in them. We note here that in the case of continuous 
dynamical systems, the delay parameter Ti may be se- 
lected so that the components of the i time series are not 
correlated with each other, but this does not imply that 
they are not correlated to components from another time 
series. 
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A different problem is that of irrelevance, when ti 
series that are not generated by the same dynamical s 
tem are included in the reconstruction procedure. T 
may be the case even when a time series is connected t 
time series generated by the system under investigatii 

An issue of concern is also the fact that multivari 
data don't always have the same data ranges and c 
tances calculated on delay vectors with components 
different ranges may depend highly on only some of i 
components. So it is often preferred to scale all the data 
to have either the same variance or be in the same data 
range. For our study we choose to scale the data to the 
range [0, 1]. 



D. Selection of the embedding dimension vector 

Taking into account the problems in the state space 
reconstruction from multivariate time series, we present 
three methods for determining m, two based on the false 
nearest neighbor algorithm, which we name FNNl and 
FNN2, and one based on local models which we call pre- 
diction error minimization criterion (PEM). 

The main idea of the FNN algorithms is as for the 
univariate case. Starting from a small value the embed- 
ding dimension is increased by including delay compo- 
nents from the p time series and the percentage of the 
false nearest neighbors is calculated until it falls to the 
zero level. The difference of the two FNN methods is on 
the way that m is increased. 

For FNNl we restrict the state space reconstruction to 
use the same embedding dimension for each of the p time 
series, i.e. m = {m,m, ...,m) for a given m. To assess 
whether m is sufficient, we consider all delay embeddings 
derived by augmenting the state vector of embedding di- 
mension vector (m, m, ...,m) with a single delayed vari- 
able from any of the p time series. Thus the check for 
false nearest neighbors in ^ yields the increase from 
the embedding dimension vector {m,m,...,m) to each 
of the embedding dimension vectors (m -I- l,m,...,m), 
(m, TO -I- 1, to), . . ., (to, to, TO -I- 1). Then the algo- 
rithm stops at the optimal m = (to, to, to) if the zero 
level percentage of false nearest neighbors is obtained for 
all p cases. A sketch of the first two steps for a bivariate 



time series is shown in Figure 1(a) 

This method has been commonly used in multivariate 
reconstruction and is more appropriate for spatiotem- 
porally distributed data (e.g. see the software package 
TISEAN dll). A potential drawback of FNNl is that 
the selected total embedding dimension M is always a 
multiple of p, possibly introducing redundant informa- 
tion in the embedding vectors. 

We modify the algorithm of FNNl to account for any 
form of the embedding dimension vector m and the total 
embedding dimension M is increased by one at each step 
of the algorithm. Let us suppose that the algorithm has 
reached at some step the total embedding dimension M. 
For this M all the combinations of the components of the 
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(0,1) 


M=2 




Cl,,1^ ^0,2) 








M=3 


(3,0) (2,!) 


(1,2) (0,3) 
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FIG. 1: Example of the first two steps of FNNl (a) and FNN2 
(b) for a bivariate time series 



embedding dimension vector m = (toi, m2, TOj,) are 
considered under the condition AI — X^f^i "^j- Then for 
each such m = (toi, TO2, m^) all the possible augmen- 
tations with one dimension are checked for false nearest 
neighbors, i.e. (toi -|- 1, TO2, nip), (toi, TO2 -I- 1, TOp), 
. . ., (toi, TO2, TOp -|- 1). A sketch of the first two steps 
of the extended FNN algorithm, denoted as FNN2, for a 



bivariate time series is shown in Figure 1(b) 

The termination criterion is the drop of the percent- 
age of false nearest neighbors to the zero level at every 
increase of M by one for at least one embedding dimen- 
sion vector (toi , TO2 , . . . , rup) . If more than one embedding 
dimension vectors fulfill this criterion, the one with the 
smallest cumulative FNN percentage is selected, where 
the cumulative FNN percentage is the sum of the p FNN 
percentages for the increase by one of the respective com- 
ponent of the embedding dimension vector. 

The PEM criterion for the selection of m = 
(toi,TO2, ...,TOp) is simply the extension of the goodness- 
of-fit or prediction criterion in the univariate case to 
account for the multiple ways the delay vector can be 
formed from the multivariate time series. Thus for 
all possible p-plets of (toi, TO2, TOp) from (1,0,...,0), 
(0, 1, ...,0), etc up to some vector of maximum embed- 
ding dimensions (mmax, "^max, ■ • ■ , "^max), the respective 
reconstructed state spaces are created, local linear mod- 
els are applied and out-of-sample prediction errors are 
computed. So, totally p™""" — 1 embedding dimension 
vectors are compared and the optimal is the one that 
gives the smallest multivariate NRMSE as defined in 



IV. MONTE CARLO SIMULATIONS AND 
RESULTS 

A. Monte Carlo setup 

We test the three methods by performing Monte Carlo 
simulations on a variety of known nonlinear dynamical 
systems. The embedding dimension vectors are selected 
using the three methods on 100 different realizations of 
each system and the most frequently selected embedding 
dimension vectors for each method are tracked. Also, for 
each realization and selected embedding dimension vec- 
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Estimate Dimension 
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FNN~\ 






PEM learning set 


test set 





FIG. 2: Sketch of the split of data for the selection of the 
embedding dimension vector with FNN and PEM. 



TABLE L Dimension vectors and NRMSE for the Ikeda map. 
Columns 2,3 and 4 contain the embedding dimension vectors 
followed by their respective frequency of occurrence 





Embedding dimensions 


NRMSE 


TV 


FNNl 


FNN2 


PEM 


FNNl 


FNN2 


PEM 


512 


(1,1) 100 


(1,1) 100 


(2,2) 81 
(1,2) 13 


0.051 


0.051 


0.032 


2048 


(1,1) 100 


(1,1) 100 


(2,2) 100 


0.028 


0.028 


0.009 


8192 


(1,1) 100 


(1,1) 100 


(2,2) 100 


0.013 


0.013 


0.003 



tor from each method, the multivariate NRMSE for out- 
of-samplc prediction is computed. The average multivari- 
ate NRMSE over the 100 realizations for each method 
is then used as an indicator of the performance of each 
method in prediction. 

The selection of the embedding dimension vector by 
FNNl, FNN2 and PEM is done on the first three quarters 
of the data, Ni = 3iV/4, and the multivariate NRMSE is 
computed on the last quarter of the data {N — Ni). For 
PEM, the same split is used on the iVi data, so that N2 — 
3Ni/4: data are used to find the neighbors (training set) 
and the rest N1—N2 are used to compute the multivariate 
NRMSE (test set) and decide for the optimal embedding 
dimension vector. A sketch of the split of the data is 
shown in Figure O The number of neighbors for the local 
models in PEM varies with N and we set Kn — 10, 25, 50 
for time series lengths N — 512, 2048, 8192, respectively. 
The parameters of the local linear model are estimated by 
ordinary least squares. For all methods the investigation 
is restricted to m,„ax — 5. 

The multivariate time series are derived from nonlin- 
ear maps of varying dimension and complexity as well as 
spatially extended maps. The results are given below for 
each system. 

B. One and two Ikeda maps 

The Ikeda map is an example of a discrete low- 
dimensional chaotic system in two variables {xn,yn) de- 
fined by the equations [19] 

Zn+i = 1 + 0.9exp(0.4i - 6i/(l + |z„p)), 
x„ = Re(2„), Hn = Im(2;„), 

where Re and Im denote the real and imaginary part, re- 
spectively, of the complex variable Zn. Given the bivari- 
ate time series of {xn,yn), both FNN methods identify 
the original vector x„ — (a;„, y„) and find m = (1,1) as 
optimal at all realizations, as shown in Table |T1 

On the other hand, the PEM criterion finds over- 
embedding as optimal, but this improves slightly the pre- 
diction, which as expected improves with the increase of 
N. 

Next we consider the sum of two Ikeda maps as a more 
complex and higher dimensional system. The bivariate 



TABLE II: Dimension vectors and NRMSE for the sum of 
two Ikeda maps 





Embedding dimensions 


NRMSE 


N 


FNNl 


FNN2 


PEM 


FNNl 


FNN2 


PEM 


512 


(2.2) 89 

(3.3) 11 


(2.2) 65 

(1.3) 26 


(2,2) 63 
(1,2) 34 


0.456 


0.480 


0.447 


2048 


(3,3) 95 
(2,2) 3 


(2,3) 43 
(3,2) 24 


(2,3) 54 
(2,2) 44 


0.339 


0.365 


0.329 


8192 


(3,3) 100 


(2.3) 43 

(1.4) 37 


(2,3) 100 


0.260 


0.304 


0.251 



time series are generated as 

Xn = Re(zi,„ -I- Z2,„), Vn = Ini(zi^„ -I- Z2^„). 

The results of the Monte Carlo simulations shown in Ta- 
ble [n] suggest that the prediction worsens dramatically 
from that in Table U and the total embedding dimension 
M increases with N. 

The FNN2 criterion generally gives multiple optimal 
m structures across realizations and PEM does the same 
but only for small N. This indicates that high complex- 
ity degrades the performance of the algorithms for small 
sample sizes. PEM is again best for predictions but over- 
all we do not observe large differences in the three meth- 
ods. 

An interesting observation is that although FNN2 finds 
two optimal m with high frequencies they both give the 
same M. This reflects the problem of identification, 
where different m unfold the attractor equally well. This 
feature cannot be observed in FNNl because the FNNl 
algorithm inspects fewer possible vectors and only one 
for each M, where M can only be multiple of p (in this 
case (1, 1) for M=2, (2,2) for M=4, etc). On the other 
hand, PEM criterion seems to converge to a single m for 
large N, which means that for the sum of the two Ikeda 
maps this particular structure gives best prediction re- 
sults. Note that there is no reason that the embedding 
dimension vectors derived from FNN2 and PEM should 
match as they are selected under different conditions. 
Moreover, it is expected that the m selected by PEM 
gives always the lowest average of multivariate NRMSE 
as it is selected to optimize prediction. 
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TABLE III: Dimension vectors and NRMSE for the KDR map 





Embedding dimensions 


NRMSE 


N 


FNNl 


FNN2 


PEM 


FNNl 


FNN2 


PEM 


512 


(1,1,1,1) 100 


(0,0,2,2) 30 (1,1,1,1) 16 


(0,1,0,1) 80 (0,1,1,1) 14 


0.776 


0.907 


0.629 


2048 


(1,1,1,1) 55 (2,2,2,2) 39 


(1,1,1,1) 37 (1,0,1,2) 21 


(0,2,1,1) 79 (0,1,0,1) 13 


0.636 


0.659 


0.486 


8192 


(2,2,2,2) 85 (1,1,1,1) 15 


(2,1,1,1) 40 (1,1,1,1) 14 


(0,2,1,1) 100 


0.558 


0.551 


0.373 



TABLE IV; Dimension vectors and NRMSE for system of Driver- Response Henon system 







Embedding dimensions 


NRMSE 


N 


C 


FNNl 


FNN2 


PEM 


FNNl 


FNN2 


PEM 


512 





(2,2) 100 


(2,2) 98 (1,2) 1 


(2,2) 75 (2,1) 10 


0.190 


0.196 


0.198 


0.4 


(2,2) 100 


(1,2) 89 (2,2) 8 


(3,2) 33 (2,2) 25 


0.102 


0.127 


0.116 


0.8 


(2,2) 100 


(2,0) 99 (2,1) 1 


(3,0) 31 (0,3) 27 


0.014 


0.012 


0.005 


2048 





(2,2) 100 


(2,2) 100 


(2,2) 100 


0.093 


0.093 


0.093 


0.4 


(2,2) 100 


(1,2) 80 (2,2) 20 


(3,3) 45 (4,3) 45 


0.050 


0.084 


0.028 


0.8 


(2,2) 100 


(2,0) 99 (2,1) 1 


(0,3) 20 (3,0) 19 


0.007 


0.006 


0.001 


8192 





(2,2) 100 


(2,2) 100 


(2,2) 100 


0.051 


0.051 


0.051 


0.4 


(2,2) 100 


(1,2) 100 


(3,3) 72 (4,3) 25 


0.027 


0.027 


0.011 


0.8 


(2,2) 100 


(2,0) 98 (0,2) 2 


(0,4) 31 (4,0) 30 


0.002 


0.002 


0.001 



C. Kicked Double Rotor map 

The Kicked Double Rotor (KDR) map is a nonlinear 
chaotic system in four variables that describes the time 
evolution of the mechanical system with the same name 
dO]. The four time series {xi^n, X2.n,yi.n,y2.n) are gen- 
erated from the equations 

X„+i = MY„ + X„, Y„+i = LY„ + W(X„) 

where 

W(X„) = (6.36sin(xi,„),9sin(x2,„))^, 
^^/0.490.21\ ^^/0.240.27\ 
V0.210.70y V0-27 0.5iy 

The simulation results on four time series of KDR map 
in Table ITTTl are similar to those on the sum of two Ikeda 
maps. 

The selection of m from PEM outperforms the two 
FNN methods with respect to prediction and converges 
to a single optimal m with N. For smaller N there seems 
to be large diversity of selected m by all methods. 



D. Driver- Response Henon system 

The Driver-Response Henon system consists of two 
Henon maps where the first Henon map (the variables 
Xi^n and yi^n are defined as in ([5])) drives the second 



Henon map [21] as follows 

X2,n+1 = 1-4 - iCxi,nX2,n + (1 " C)xl ,^) + y2,n 
2/2, n+1 = 0-SX2^n 

for a driving strength C. We set C — 0, 0.4, 0.8 regard- 
ing three different states for the two systems: indepen- 
dent (C=0), moderately dependent (C=0.4) and strongly 
dependent (C=0.8). The results of the Monte Carlo sim- 
ulations for the bivariate time series of (xi^n, X2.n) are 
given in Table HVl 

First we observe that FNNl gives uniform results for 
all C and N. When the two time series are independent 
(C=0) there is actually no reason to apply multivariate 
embedding and all methods select for all N the same em- 
bedding dimension vector (2,2) (less than 100% frequency 
only for N=512 with FNN2 and PEM). Since the opti- 
mal embedding dimension for the Henon map is known 
to be 2 this result seems quite reasonable. 

When C=0.4 the moderate dependence of the second 
time series to the first affects the selection of the embed- 
ding dimension vector. FNN2 selects mostly m = (1,2), 
which means that this method detects that information 
of the driver time series is passed to the response, thus 
it utilizes more the response time series for unfolding the 
attractor. On the other hand PEM tends to select vectors 
with larger embedding dimension for the driver ((3,2) for 
N=512 and (4,3) for 7V=2048) because this information 
is more useful for prediction purposes. Also PEM gives 
over-embedding as for the sum of two Ikeda maps. 

The strong dependence of the second time series to the 
first when C=0.8 implies that the system is less complex 
and so a smaller M is needed for embedding. For all 
N, FNN2 almost always selects the vector (2,0), whereas 
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TABLE V: Dimension vectors and NRMSE for Lattice of 3 coupled Henon maps 







Embedding dimensions 


NRMSE 


N 


C 


FNNl 


FNN2 


PEM 


FNNl 


FNN2 


PEM 


512 


0.4 


(2,2,2) 94 (1,1,1) 6 


(2,1,1) 46 (1,1,2) 37 


(1,2,1) 29 (1,1,2) 23 


0.342 


0.298 


0.283 


0.8 


(2,2,2) 98 (1,1,1) 2 


(2,0,2) 91 (2,1,1) 4 


(2,0,2) 44 (2,1,1) 22 


0.294 


0.228 


0.210 


2048 


0.4 


(2,2,2) 100 


(1,2,1) 85 (1,1,2) 8 


(1,2,2) 34 (2,2,1) 30 


0.169 


0.203 


0.170 


0.8 


(2,2,2) 100 


(2,0,2) 65 (1,2,1) 14 


(2,1,2) 48 (2,0,2) 41 


0.119 


0.131 


0.112 


8192 


0.4 


(2,2,2) 100 


(1,2,1) 100 


(2,2,2) 97 (3,2,3) 3 


0.107 


0.174 


0.106 


0.8 


(2,2,2) 100 


(2,0,2) 100 


(2,1,2) 79 (3,2,3) 19 


0.071 


0.084 


0.064 



TABLE VL Dimension vectors and NRMSE for Lattice of 4 coupled Henon maps 







Embedding dimensions 


NRMSE 


N 


C 


FNNl 


FNN2 


PEM 


FNNl 


FNN2 


PEM 


512 


0.4 


(1,1,1,1) 100 


(1,1,1,1) 42 (1,0,2,1) 17 


(1,1,1,1) 45 (1,2,1,1) 20 


0.285 


0.363 


0.288 


0.8 


(1,1,1,1) 100 


(1,1,1,1) 40 (1,0,1,2) 17 


(1,1,2,1) 25 (1,2,1,1) 17 


0.314 


0.357 


0.291 


2048 


0.4 


(1,1,1,1) 88 (2,2,2,2) 12 


(1,1,1,1) 88 (1,1,1,2) 7 


(1,2,2,1) 31 (2,1,2,1) 19 


0.229 


0.228 


0.190 


0.8 


(1,1,1,1) 72 (2,2,2,2) 28 


(1,1,1,1) 36 (1,0,2,1) 33 


(2,1,1,2) 27 (2,2,1,1) 23 


0.225 


0.261 


0.163 


8192 


0.4 


(1,1,1,1) 85 (2,2,2,2) 15 


(1,1,1,1) 85 (1,2,1,1) 8 


(1,2,1,2) 46 (2,1,2,1) 45 


0.197 


0.200 


0.137 


0.8 


(2,2,2,2) 86 (1,1,1,1) 14 


(1,2,0,1) 31 (1,0,2,1) 22 


(3,2,3,3) 79 (2,1,2,2) 13 


0.131 


0.209 


0.072 



PEM cannot distinguish the two time series and selects 
with almost equal frequencies vectors of the form (m, 0) 
and (0, to) giving again over-embedding as TV increases. 
Thus PEM does not reveal the coupling structure of the 
underlying system and picks any embedding dimension 
structure among a range of structures that give essen- 
tially equivalent predictions. Here FNN2 seems to de- 
tect sufficiently the underlying coupling structure in the 
system resulting in a smaller total embedding dimension 
that gives however the same level of prediction as the 
larger M suggested by FNNl and slightly smaller than 
the even larger M found by PEM. 

E. Lattices of coupled Henon maps 

The last system is an example of spatiotemporal chaos 
and is defined as a lattice of k coupled Henon maps 
{xi,n,yi,n}i=i specified by the equations 

= 1.4 - ((1 - C)x,,n + £i^i^l^^:^^i±l^)^ + y,^„ 

The connection of the k maps is restricted between adja- 
cent maps in the ordered list of maps, i.e. each Xi^n, i — 
2,...,fc — 1 is connected to Xi+i^n and Xi-i^m with the 
"boundary" maps for i = 1, fc being simple Henon maps. 
As in the case of driven-rcsponse Henon maps the com- 
plexity of the system and the nature of the dependence 
of the time series with each other is determined by their 
coupling strength, which here is fixed for all couplings in 
the lattice. The results for coupling strengths C=0.4 and 
0.8 and two lattice structures for fc=3 and k—A are given 
in Tables |V] and IVIl respectively. 



For both lattices the results are similar to that of the 
driven-response Henon maps. For A; = 3, PEM does not 
single out an m structure for small sample sizes or for 
moderate coupling, where FNN2 generally does. PEM 
again gives the best prediction results and FNN2 is more 
conservative giving always the smallest M of all three 
methods. The lattice involving 4 maps is a more com- 
plicated system since there are more possible embedding 
dimension vectors for a given M and thus there is more 
diversity in the results. We note that for almost all com- 
binations of data size, coupling strength and method 
(meaning FNN2 and PEM) there are multiple selected 
optimum dimension vectors. Beyond this, the differences 
in FNN2 and PEM discussed for fc = 3 persist also for 
A: = 4. 



V. DISCUSSION 

There does not seem to be an optimal scheme for state 
space reconstruction from multivariate time series. The 
simulation results on two schemes proposed in this work, 
one based on unfolding the attractor at any possible di- 
rection (FNN2) and the other aiming at optimizing pre- 
diction performance (PEM), seem to confirm this. 

When the goal of state space reconstruction is to make 
predictions, selection of multivariate embedding with the 
prediction criterion PEM is best, but this results often to 
over-embedding (large total embedding dimension) and 
does not really estimates the actual degrees of freedom 
of the underlying system. This can also be justified from 
the fact that the dimension of the reconstructed state 
space selected by PEM tends to increase with the sam- 
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pic size, at least for the sizes we used in the simulations. 
Such a feature shows lack of consistency of the PEM cri- 
terion and suggests that the selection is led from factors 
inherent in the prediction process rather than the quality 
of the reconstructed attractor. For example the increase 
of embedding dimension with the sample size can be ex- 
plained by the fact that more data lead to abundance of 
close neighbors used in local prediction models and this 
in turn suggests that augmenting the embedding vectors 
would allow to locate the K neighbors used in the model. 

On the other hand, the two schemes used here that ex- 
tend the method of false nearest neighbors (FNN) to mul- 
tivariate time series aim at finding minimum embedding 
that unfolds the attractor, but often a higher embedding 
gives better prediction results. In particular, the sec- 
ond scheme (FNN2) that explores all possible embedding 
structures gives consistent selection of an embedding of 
smaller dimension than that selected by PEM. Moreover, 
this embedding could be justified by the underlying dy- 
namics of the known systems we tested. However, lack of 
consistency of the selected embedding was observed with 
all methods for small sample sizes (somehow expected 
due to large variance of any estimate) and for the cou- 
pled maps (probably due to the presence of more than 
one optimal embeddings). 

In this work, we used only a prediction performance 
criterion to assess the quality of state space reconstruc- 
tion, mainly because it has the most practical relevance. 
There is no reason to expect that PEM would be found 
best if the assessment was done using another criterion 
not based on prediction. However, the reference (true) 



value of other measures, such as the correlation dimen- 
sion, are not known for all systems used in this study. An- 
other constraint of this work is that only noise- free multi- 
variate time series from discrete systems are encountered, 
so that the delay parameter is not involved in the state 
space reconstruction and the effect of noise is not studied. 
It is expected that the addition of noise would perplex 
further the process of selecting optimal embedding di- 
mension and degrade the performance of the algorithms. 
For example, we found that in the case of the Henon 
map the addition of noise of equal magnitude to the two 
time series of the system makes the criteria to select any 
of the three equivalent embeddings ((2, 0),(0, 2),(1, 1)) at 
random. It is in the purpose of the authors to extent 
this work and include noisy multivariate time series, also 
from flows, and search for other measures to assess the 
performance of the embedding selection methods. 
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